Overview

Dataset statistics

Number of variables: 28
Number of observations: 10000
Missing cells: 19391
Missing cells (%): 6.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.1 MiB
Average record size in memory: 1.2 KiB
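
The derived figures in this table follow from the raw counts: the total number of cells is variables times observations, and the average record size is the total memory divided by the number of observations. A small sketch of that arithmetic (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 28
n_observations = 10_000
missing_cells = 19_391
total_size_mib = 12.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 KiB
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"{missing_pct:.1f}%")          # 6.9%
print(f"{avg_record_kib:.1f} KiB")    # 1.2 KiB
```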

Variable types

CAT: 14
NUM: 11
BOOL: 3

Reproduction

Analysis started: 2020-11-05 17:43:15.863490
Analysis finished: 2020-11-05 17:44:21.404337
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

emp_title has a high cardinality: 8183 distinct values
Notes has a high cardinality: 6761 distinct values
purpose has a high cardinality: 5677 distinct values
zip_code has a high cardinality: 720 distinct values
earliest_cr_line has a high cardinality: 463 distinct values
emp_title has 592 (5.9%) missing values
Notes has 3230 (32.3%) missing values
mths_since_last_delinq has 6316 (63.2%) missing values
mths_since_last_record has 9160 (91.6%) missing values
delinq_2yrs has 8910 (89.1%) zeros
inq_last_6mths has 4602 (46.0%) zeros
mths_since_last_delinq has 163 (1.6%) zeros
mths_since_last_record has 267 (2.7%) zeros
revol_bal has 278 (2.8%) zeros
revol_util has 254 (2.5%) zeros

Variables

Id
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count10000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5000.5
Minimum1
Maximum10000
Zeros0
Zeros (%)0.0%
Memory size78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 500.95
Q1: 2500.75
Median: 5000.5
Q3: 7500.25
95th percentile: 9500.05
Maximum: 10000
Range: 9999
Interquartile range (IQR): 4999.5

Descriptive statistics

Standard deviation: 2886.89568
Coefficient of variation (CV): 0.5773214038
Kurtosis: -1.2
Mean: 5000.5
Median Absolute Deviation (MAD): 2500
Skewness: 0
Sum: 50005000
Variance: 8334166.667
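
The Id statistics above can be checked by hand. The sketch below assumes that Id takes each integer from 1 to 10000 exactly once (consistent with the reported distinct count, minimum, and maximum, but not shown directly), that the variance is the sample variance with n - 1 in the denominator, and that percentiles are linearly interpolated between neighbouring sorted values. The `percentile` helper is illustrative and is not the profiler's own code.

```python
import math

# Assumption: Id is exactly the integers 1..10000, each occurring once.
values = list(range(1, 10_001))
n = len(values)

mean = sum(values) / n
# Sample variance: divide the sum of squared deviations by n - 1.
variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation


def percentile(sorted_values, q):
    """Linearly interpolated percentile of pre-sorted data, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


print(mean, round(variance, 3), round(std, 5), round(cv, 10))
print(percentile(values, 0.05), percentile(values, 0.25), percentile(values, 0.5))
```

Under these assumptions the results agree with the reported mean, variance, standard deviation, CV, and quantiles to the precision shown in the report.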
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.e+00 1.e+04], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2047 1 < 0.1%
 
5424 1 < 0.1%
 
1338 1 < 0.1%
 
7481 1 < 0.1%
 
5432 1 < 0.1%
 
9526 1 < 0.1%
 
3379 1 < 0.1%
 
1330 1 < 0.1%
 
7473 1 < 0.1%
 
9518 1 < 0.1%
 
Other values (9990) 9990 99.9%
 
Value Count Frequency (%)
1 1 < 0.1%
 
2 1 < 0.1%
 
3 1 < 0.1%
 
4 1 < 0.1%
 
5 1 < 0.1%
 
Value Count Frequency (%)
10000 1 < 0.1%
 
9999 1 < 0.1%
 
9998 1 < 0.1%
 
9997 1 < 0.1%
 
9996 1 < 0.1%
 

is_bad
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
0 8705 87.1%
 
1 1295 13.0%
 

emp_title
Categorical

HIGH CARDINALITY
MISSING
UNIFORM
Distinct count8183
Unique (%)87.0%
Missing592
Missing (%)5.9%
Memory size78.2 KiB
Value Count Frequency (%)
US Army 37 0.4%
 
Bank of America 23 0.2%
 
IBM 22 0.2%
 
USAF 17 0.2%
 
US Navy 17 0.2%
 
United States Air Force 16 0.2%
 
Wells Fargo 15 0.1%
 
AT&T 14 0.1%
 
U.S. Army 14 0.1%
 
Self Employed 14 0.1%
 
Other values (8173) 9219 92.2%
 
(Missing) 592 5.9%
 

Length

Max length78
Mean length17.4252
Min length2
Value Count Frequency (%)
Uppercase_Letter 29 32.6%
 
Lowercase_Letter 26 29.2%
 
Other_Punctuation 11 12.4%
 
Decimal_Number 10 11.2%
 
Open_Punctuation 2 2.2%
 
Close_Punctuation 1 1.1%
 
Final_Punctuation 1 1.1%
 
Other_Symbol 1 1.1%
 
Math_Symbol 1 1.1%
 
Space_Separator 1 1.1%
 
Other values (6) 6 6.7%
 
Value Count Frequency (%)
Latin 55 61.8%
 
Common 34 38.2%
 
Value Count Frequency (%)
ASCII 81 97.6%
 
Punctuation 2 2.4%
 

emp_length
Categorical

Distinct count14
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
10 2160 21.6%
 
1 2083 20.8%
 
2 1183 11.8%
 
3 1010 10.1%
 
4 889 8.9%
 
5 779 7.8%
 
6 535 5.3%
 
7 421 4.2%
 
8 351 3.5%
 
9 331 3.3%
 
Other values (4) 258 2.6%
 

Length

Max length2
Mean length1.2418
Min length1
Value Count Frequency (%)
Decimal_Number 10 83.3%
 
Lowercase_Letter 2 16.7%
 
Value Count Frequency (%)
Common 10 83.3%
 
Latin 2 16.7%
 
Value Count Frequency (%)
ASCII 12 100.0%
 

home_ownership
Categorical

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
RENT 4745 47.4%
 
MORTGAGE 4445 44.5%
 
OWN 775 7.8%
 
OTHER 34 0.3%
 
NONE 1 < 0.1%
 

Length

Max length8
Mean length5.7039
Min length3
Value Count Frequency (%)
Uppercase_Letter 10 100.0%
 
Value Count Frequency (%)
Latin 10 100.0%
 
Value Count Frequency (%)
ASCII 10 100.0%
 

annual_inc
Real number (ℝ≥0)

Distinct count1901
Unique (%)19.0%
Missing1
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean68203.01154
Minimum2000
Maximum900000
Zeros0
Zeros (%)0.0%
Memory size78.2 KiB

Quantile statistics

Minimum: 2000
5th percentile: 23734
Q1: 40000
Median: 58000
Q3: 82000
95th percentile: 143550
Maximum: 900000
Range: 898000
Interquartile range (IQR): 42000

Descriptive statistics

Standard deviation: 48590.25276
Coefficient of variation (CV): 0.7124355899
Kurtosis: 51.15309953
Mean: 68203.01154
Median Absolute Deviation (MAD): 30247.19103
Skewness: 4.880305421
Sum: 681961912.4
Variance: 2361012663
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
60000 381 3.8%
 
50000 267 2.7%
 
40000 222 2.2%
 
75000 213 2.1%
 
30000 211 2.1%
 
65000 204 2.0%
 
48000 196 2.0%
 
70000 193 1.9%
 
45000 181 1.8%
 
80000 170 1.7%
 
Other values (1891) 7761 77.6%
 
Value Count Frequency (%)
2000 1 < 0.1%
 
4080 1 < 0.1%
 
4200 2 < 0.1%
 
4800 2 < 0.1%
 
5000 2 < 0.1%
 
Value Count Frequency (%)
900000 2 < 0.1%
 
860000 1 < 0.1%
 
780000 1 < 0.1%
 
744000 1 < 0.1%
 
725000 1 < 0.1%
 

verification_status
Categorical

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
not verified 4367 43.7%
 
VERIFIED - income 3214 32.1%
 
VERIFIED - income source 2419 24.2%
 

Length

Max length24
Mean length16.5098
Min length12
Value Count Frequency (%)
Lowercase_Letter 13 61.9%
 
Uppercase_Letter 6 28.6%
 
Dash_Punctuation 1 4.8%
 
Space_Separator 1 4.8%
 
Value Count Frequency (%)
Latin 19 90.5%
 
Common 2 9.5%
 
Value Count Frequency (%)
ASCII 21 100.0%
 

pymnt_plan
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
n 9998 > 99.9%
 
y 2 < 0.1%
 

Notes
Categorical

HIGH CARDINALITY
MISSING
UNIFORM
Distinct count6761
Unique (%)99.9%
Missing3230
Missing (%)32.3%
Memory size78.2 KiB
Value Count Frequency (%)
Personal Loan 3 < 0.1%
 
Debt Consolidation 3 < 0.1%
 
Camping Membership 2 < 0.1%
 
refinancing 2 < 0.1%
 
I am consolidating credit card debt. 2 < 0.1%
 
I am a recent college graduate that is in need to pay down high interest credit card debt. I had to pay my own way through college and have student loans and credit card debt to show for it. I now have a good paying full time job and would like to pay down the high interest credit card debt that I have for a better financial future. 2 < 0.1%
 
This loan would be to consolidate my credit card debts, and have one payment at a reasonable interest rate. 2 < 0.1%
 
Borrower added on 07/14/11 > I am consolidating all of my debt into one payment so I am able to get debt free in 36 months. I can get you a list of the debts that will be paid if needed. Plan on getting this paid in 36 months and then look for a new house for the family.<br/> 1 < 0.1%
 
Borrower added on 01/17/10 > I would like to end the seemingly never ending payments to my credit cards and cancel as many as I can once they are paid off.<br/> 1 < 0.1%
 
Borrower added on 12/04/11 > My goal is to closeout my student loan and a few other things. 2012 is a fresh start! <br> 1 < 0.1%
 
Other values (6751) 6751 67.5%
 
(Missing) 3230 32.3%
 

Length

Max length3988
Mean length291.4693
Min length3
Value Count Frequency (%)
Lowercase_Letter 31 27.2%
 
Uppercase_Letter 30 26.3%
 
Other_Punctuation 15 13.2%
 
Decimal_Number 10 8.8%
 
Math_Symbol 6 5.3%
 
Control 4 3.5%
 
Currency_Symbol 3 2.6%
 
Dash_Punctuation 3 2.6%
 
Initial_Punctuation 2 1.8%
 
Close_Punctuation 2 1.8%
 
Other values (6) 8 7.0%
 
Value Count Frequency (%)
Latin 61 53.5%
 
Common 53 46.5%
 
Value Count Frequency (%)
ASCII 94 95.9%
 
Punctuation 4 4.1%
 

purpose_cat
Categorical

Distinct count27
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
debt consolidation 4454 44.5%
 
credit card 1273 12.7%
 
other 1026 10.3%
 
home improvement 800 8.0%
 
major purchase 546 5.5%
 
small business 461 4.6%
 
car 349 3.5%
 
wedding 250 2.5%
 
medical 183 1.8%
 
moving 159 1.6%
 
Other values (17) 499 5.0%
 

Length

Max length33
Mean length13.9381
Min length3
Value Count Frequency (%)
Lowercase_Letter 21 95.5%
 
Space_Separator 1 4.5%
 
Value Count Frequency (%)
Latin 21 95.5%
 
Common 1 4.5%
 
Value Count Frequency (%)
ASCII 22 100.0%
 

purpose
Categorical

HIGH CARDINALITY
Distinct count5677
Unique (%)56.8%
Missing4
Missing (%)< 0.1%
Memory size78.2 KiB
Value Count Frequency (%)
Debt Consolidation 548 5.5%
 
Debt Consolidation Loan 416 4.2%
 
Personal Loan 145 1.5%
 
Consolidation 132 1.3%
 
debt consolidation 117 1.2%
 
Home Improvement 108 1.1%
 
Small Business Loan 95 0.9%
 
Personal 92 0.9%
 
Credit Card Consolidation 90 0.9%
 
Debt consolidation 74 0.7%
 
Other values (5667) 8179 81.8%
 

Length

Max length80
Mean length17.1821
Min length1
Value Count Frequency (%)
Uppercase_Letter 29 30.5%
 
Lowercase_Letter 27 28.4%
 
Other_Punctuation 17 17.9%
 
Decimal_Number 10 10.5%
 
Math_Symbol 3 3.2%
 
Open_Punctuation 2 2.1%
 
Currency_Symbol 2 2.1%
 
Dash_Punctuation 1 1.1%
 
Connector_Punctuation 1 1.1%
 
Space_Separator 1 1.1%
 
Other values (2) 2 2.1%
 
Value Count Frequency (%)
Latin 56 58.9%
 
Common 39 41.1%
 
Value Count Frequency (%)
ASCII 87 96.7%
 
Punctuation 3 3.3%
 

zip_code
Categorical

HIGH CARDINALITY
Distinct count720
Unique (%)7.2%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
100xx 158 1.6%
 
112xx 141 1.4%
 
945xx 129 1.3%
 
070xx 125 1.2%
 
606xx 114 1.1%
 
900xx 107 1.1%
 
021xx 99 1.0%
 
941xx 95 0.9%
 
926xx 94 0.9%
 
331xx 93 0.9%
 
Other values (710) 8845 88.4%
 

Length

Max length5
Mean length5
Min length5
Value Count Frequency (%)
Decimal_Number 10 90.9%
 
Lowercase_Letter 1 9.1%
 
Value Count Frequency (%)
Common 10 90.9%
 
Latin 1 9.1%
 
Value Count Frequency (%)
ASCII 11 100.0%
 

addr_state
Categorical

Distinct count50
Unique (%)0.5%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
CA 1748 17.5%
 
NY 958 9.6%
 
FL 714 7.1%
 
TX 700 7.0%
 
NJ 482 4.8%
 
VA 392 3.9%
 
IL 386 3.9%
 
PA 378 3.8%
 
GA 357 3.6%
 
MA 331 3.3%
 
Other values (40) 3554 35.5%
 

Length

Max length2
Mean length2
Min length2
Value Count Frequency (%)
Uppercase_Letter 24 100.0%
 
Value Count Frequency (%)
Latin 24 100.0%
 
Value Count Frequency (%)
ASCII 24 100.0%
 

debt_to_income
Real number (ℝ≥0)

Distinct count2585
Unique (%)25.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean13.338704
Minimum0
Maximum29.99
Zeros58
Zeros (%)0.6%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 2.129
Q1: 8.16
Median: 13.41
Q3: 18.6925
95th percentile: 23.93
Maximum: 29.99
Range: 29.99
Interquartile range (IQR): 10.5325

Descriptive statistics

Standard deviation: 6.754211507
Coefficient of variation (CV): 0.5063619004
Kurtosis: -0.8546793248
Mean: 13.338704
Median Absolute Deviation (MAD): 5.669516109
Skewness: -0.008777611376
Sum: 133387.04
Variance: 45.61937308
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.055 0.195 3.395 7.675 20.325 22.835 24.965 26.885 29.99 ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 58 0.6%
 
12.48 16 0.2%
 
13.51 13 0.1%
 
10 13 0.1%
 
19.2 13 0.1%
 
18.14 13 0.1%
 
4.8 12 0.1%
 
17.82 12 0.1%
 
15.38 12 0.1%
 
22.43 12 0.1%
 
Other values (2575) 9826 98.3%
 
Value Count Frequency (%)
0 58 0.6%
 
0.11 1 < 0.1%
 
0.12 1 < 0.1%
 
0.13 1 < 0.1%
 
0.14 2 < 0.1%
 
Value Count Frequency (%)
29.99 1 < 0.1%
 
29.93 1 < 0.1%
 
29.92 1 < 0.1%
 
29.83 1 < 0.1%
 
29.74 1 < 0.1%
 

delinq_2yrs
Real number (ℝ≥0)

ZEROS
Distinct count10
Unique (%)0.1%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean0.148174087
Minimum0
Maximum11
Zeros8910
Zeros (%)89.1%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 1
Maximum: 11
Range: 11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5062698917
Coefficient of variation (CV): 3.416723543
Kurtosis: 54.81013986
Mean: 0.148174087
Median Absolute Deviation (MAD): 0.2641783123
Skewness: 5.639317112
Sum: 1481
Variance: 0.2563092032
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
0 8910 89.1%
 
1 822 8.2%
 
2 186 1.9%
 
3 50 0.5%
 
4 14 0.1%
 
5 6 0.1%
 
6 3 < 0.1%
 
7 2 < 0.1%
 
11 1 < 0.1%
 
8 1 < 0.1%
 
(Missing) 5 0.1%
 
Value Count Frequency (%)
0 8910 89.1%
 
1 822 8.2%
 
2 186 1.9%
 
3 50 0.5%
 
4 14 0.1%
 
Value Count Frequency (%)
11 1 < 0.1%
 
8 1 < 0.1%
 
7 2 < 0.1%
 
6 3 < 0.1%
 
5 6 0.1%
 

earliest_cr_line
Categorical

HIGH CARDINALITY
Distinct count463
Unique (%)4.6%
Missing5
Missing (%)< 0.1%
Memory size78.2 KiB
Value Count Frequency (%)
11/1/98 95 0.9%
 
11/1/00 95 0.9%
 
12/1/98 91 0.9%
 
10/1/98 86 0.9%
 
10/1/01 84 0.8%
 
10/1/99 84 0.8%
 
10/1/00 83 0.8%
 
7/1/00 82 0.8%
 
11/1/97 80 0.8%
 
11/1/96 80 0.8%
 
Other values (453) 9135 91.3%
 

Length

Max length7
Mean length6.3008
Min length3
Value Count Frequency (%)
Decimal_Number 10 76.9%
 
Lowercase_Letter 2 15.4%
 
Other_Punctuation 1 7.7%
 
Value Count Frequency (%)
Common 11 84.6%
 
Latin 2 15.4%
 
Value Count Frequency (%)
ASCII 13 100.0%
 

inq_last_6mths
Real number (ℝ≥0)

ZEROS
Distinct count20
Unique (%)0.2%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean1.066933467
Minimum0
Maximum25
Zeros4602
Zeros (%)46.0%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 2
95th percentile: 4
Maximum: 25
Range: 25
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.47605196
Coefficient of variation (CV): 1.383452676
Kurtosis: 23.67847049
Mean: 1.066933467
Median Absolute Deviation (MAD): 1.01844467
Skewness: 3.116059024
Sum: 10664
Variance: 2.178729389
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
0 4602 46.0%
 
1 2684 26.8%
 
2 1431 14.3%
 
3 731 7.3%
 
4 227 2.3%
 
5 152 1.5%
 
6 76 0.8%
 
7 42 0.4%
 
8 27 0.3%
 
9 10 0.1%
 
Other values (10) 13 0.1%
 
(Missing) 5 0.1%
 
Value Count Frequency (%)
0 4602 46.0%
 
1 2684 26.8%
 
2 1431 14.3%
 
3 731 7.3%
 
4 227 2.3%
 
Value Count Frequency (%)
25 1 < 0.1%
 
24 1 < 0.1%
 
18 2 < 0.1%
 
17 1 < 0.1%
 
16 1 < 0.1%
 

mths_since_last_delinq
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count91
Unique (%)2.5%
Missing6316
Missing (%)63.2%
Infinite0
Infinite (%)0.0%
Mean35.89033659
Minimum0
Maximum120
Zeros163
Zeros (%)1.6%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 2
Q1: 18
Median: 34
Q3: 53
95th percentile: 75
Maximum: 120
Range: 120
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.3614429
Coefficient of variation (CV): 0.6230491276
Kurtosis: -0.8171447741
Mean: 35.89033659
Median Absolute Deviation (MAD): 18.73116249
Skewness: 0.2929154934
Sum: 132220
Variance: 500.0341287
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
0 163 1.6%
 
30 69 0.7%
 
34 66 0.7%
 
38 65 0.7%
 
23 65 0.7%
 
44 64 0.6%
 
24 64 0.6%
 
33 63 0.6%
 
20 63 0.6%
 
18 61 0.6%
 
Other values (81) 2941 29.4%
 
(Missing) 6316 63.2%
 
Value Count Frequency (%)
0 163 1.6%
 
1 6 0.1%
 
2 29 0.3%
 
3 40 0.4%
 
4 37 0.4%
 
Value Count Frequency (%)
120 1 < 0.1%
 
115 1 < 0.1%
 
97 1 < 0.1%
 
96 1 < 0.1%
 
95 1 < 0.1%
 

mths_since_last_record
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count94
Unique (%)11.2%
Missing9160
Missing (%)91.6%
Infinite0
Infinite (%)0.0%
Mean61.65238095
Minimum0
Maximum119
Zeros267
Zeros (%)2.7%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 86
Q3: 101
95th percentile: 115.05
Maximum: 119
Range: 119
Interquartile range (IQR): 101

Descriptive statistics

Standard deviation: 46.18961922
Coefficient of variation (CV): 0.7491944108
Kurtosis: -1.586694469
Mean: 61.65238095
Median Absolute Deviation (MAD): 42.56052154
Skewness: -0.3831245675
Sum: 51788
Variance: 2133.480924
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
0 267 2.7%
 
89 21 0.2%
 
116 18 0.2%
 
86 17 0.2%
 
87 17 0.2%
 
92 17 0.2%
 
100 16 0.2%
 
114 16 0.2%
 
104 16 0.2%
 
105 15 0.1%
 
Other values (84) 420 4.2%
 
(Missing) 9160 91.6%
 
Value Count Frequency (%)
0 267 2.7%
 
6 1 < 0.1%
 
11 1 < 0.1%
 
17 1 < 0.1%
 
20 2 < 0.1%
 
Value Count Frequency (%)
119 3 < 0.1%
 
118 11 0.1%
 
117 10 0.1%
 
116 18 0.2%
 
115 10 0.1%
 

open_acc
Real number (ℝ≥0)

Distinct count36
Unique (%)0.4%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean9.334567284
Minimum1
Maximum39
Zeros0
Zeros (%)0.0%
Memory size78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 6
Median: 9
Q3: 12
95th percentile: 18
Maximum: 39
Range: 38
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.526589744
Coefficient of variation (CV): 0.4849276465
Kurtosis: 1.838467994
Mean: 9.334567284
Median Absolute Deviation (MAD): 3.516796938
Skewness: 1.063599744
Sum: 93299
Variance: 20.49001471
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
7 1035 10.3%
 
6 990 9.9%
 
8 937 9.4%
 
9 929 9.3%
 
10 805 8.1%
 
5 763 7.6%
 
11 692 6.9%
 
4 631 6.3%
 
12 577 5.8%
 
13 487 4.9%
 
Other values (26) 2149 21.5%
 
Value Count Frequency (%)
1 7 0.1%
 
2 163 1.6%
 
3 374 3.7%
 
4 631 6.3%
 
5 763 7.6%
 
Value Count Frequency (%)
39 1 < 0.1%
 
36 2 < 0.1%
 
35 1 < 0.1%
 
33 3 < 0.1%
 
32 1 < 0.1%
 

pub_rec
Categorical

Distinct count4
Unique (%)< 0.1%
Missing5
Missing (%)< 0.1%
Memory size78.2 KiB
Value Count Frequency (%)
0 9422 94.2%
 
1 550 5.5%
 
2 18 0.2%
 
3 5 0.1%
 
(Missing) 5 0.1%
 

Length

Max length3
Mean length3
Min length3
Value Count Frequency (%)
Decimal_Number 4 57.1%
 
Lowercase_Letter 2 28.6%
 
Other_Punctuation 1 14.3%
 
Value Count Frequency (%)
Common 5 71.4%
 
Latin 2 28.6%
 
Value Count Frequency (%)
ASCII 7 100.0%
 

revol_bal
Real number (ℝ≥0)

ZEROS
Distinct count8130
Unique (%)81.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14271.0074
Minimum0
Maximum1207359
Zeros278
Zeros (%)2.8%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 277.95
Q1: 3524.5
Median: 8645.5
Q3: 16952.25
95th percentile: 44554.85
Maximum: 1207359
Range: 1207359
Interquartile range (IQR): 13427.75

Descriptive statistics

Standard deviation: 25437.9082
Coefficient of variation (CV): 1.782488614
Kurtosis: 570.4140985
Mean: 14271.0074
Median Absolute Deviation (MAD): 11728.71446
Skewness: 16.32424653
Sum: 142710074
Variance: 647087173.7
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000000e+00 5.000000e-01 2.550000e+01 7.825000e+02 5.878000e+03 ... 8.222900e+04 1.205955e+05 1.727790e+05 2.836010e+05 1.207359e+06], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 278 2.8%
 
2227 6 0.1%
 
1763 6 0.1%
 
11628 5 0.1%
 
4801 5 0.1%
 
760 5 0.1%
 
5272 4 < 0.1%
 
18550 4 < 0.1%
 
15 4 < 0.1%
 
5220 4 < 0.1%
 
Other values (8120) 9679 96.8%
 
Value Count Frequency (%)
0 278 2.8%
 
1 2 < 0.1%
 
3 2 < 0.1%
 
5 1 < 0.1%
 
6 2 < 0.1%
 
Value Count Frequency (%)
1207359 1 < 0.1%
 
602519 1 < 0.1%
 
508961 1 < 0.1%
 
487589 1 < 0.1%
 
423189 1 < 0.1%
 

revol_util
Real number (ℝ≥0)

ZEROS
Distinct count1027
Unique (%)10.3%
Missing26
Missing (%)0.3%
Infinite0
Infinite (%)0.0%
Mean48.450771
Minimum0
Maximum100.6
Zeros254
Zeros (%)2.5%
Memory size78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 2.8
Q1: 25
Median: 48.7
Q3: 71.8
95th percentile: 93.6
Maximum: 100.6
Range: 100.6
Interquartile range (IQR): 46.8

Descriptive statistics

Standard deviation: 28.22055724
Coefficient of variation (CV): 0.5824583727
Kurtosis: -1.099296594
Mean: 48.450771
Median Absolute Deviation (MAD): 24.12794997
Skewness: -0.01672374423
Sum: 483247.99
Variance: 796.3998507
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
0 254 2.5%
 
46.6 21 0.2%
 
43.4 20 0.2%
 
0.1 20 0.2%
 
47.6 19 0.2%
 
56.8 19 0.2%
 
55.4 19 0.2%
 
53.6 19 0.2%
 
70 19 0.2%
 
31.4 18 0.2%
 
Other values (1017) 9546 95.5%
 
(Missing) 26 0.3%
 
Value Count Frequency (%)
0 254 2.5%
 
0.03 1 < 0.1%
 
0.1 20 0.2%
 
0.12 1 < 0.1%
 
0.2 11 0.1%
 
Value Count Frequency (%)
100.6 1 < 0.1%
 
100 1 < 0.1%
 
99.9 4 < 0.1%
 
99.8 5 0.1%
 
99.7 3 < 0.1%
 

total_acc
Real number (ℝ≥0)

Distinct count75
Unique (%)0.8%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean22.01130565
Minimum1
Maximum90
Zeros0
Zeros (%)0.0%
Memory size78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 6
Q1: 13
Median: 20
Q3: 29
95th percentile: 44
Maximum: 90
Range: 89
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 11.70939957
Coefficient of variation (CV): 0.5319720581
Kurtosis: 0.9238037612
Mean: 22.01130565
Median Absolute Deviation (MAD): 9.292220477
Skewness: 0.8707976619
Sum: 220003
Variance: 137.1100383
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
15 369 3.7%
 
20 360 3.6%
 
17 360 3.6%
 
12 357 3.6%
 
14 351 3.5%
 
19 346 3.5%
 
16 340 3.4%
 
18 339 3.4%
 
13 331 3.3%
 
22 329 3.3%
 
Other values (65) 6513 65.1%
 
Value Count Frequency (%)
1 3 < 0.1%
 
2 10 0.1%
 
3 58 0.6%
 
4 115 1.1%
 
5 144 1.4%
 
Value Count Frequency (%)
90 1 < 0.1%
 
81 1 < 0.1%
 
80 1 < 0.1%
 
79 1 < 0.1%
 
78 1 < 0.1%
 

initial_list_status
Categorical

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
f 9983 99.8%
 
m 17 0.2%
 

Length

Max length1
Mean length1
Min length1
Value Count Frequency (%)
Lowercase_Letter 2 100.0%
 
Value Count Frequency (%)
Latin 2 100.0%
 
Value Count Frequency (%)
ASCII 2 100.0%
 

collections_12_mths_ex_med
Boolean

Distinct count1
Unique (%)< 0.1%
Missing32
Missing (%)0.3%
Memory size78.2 KiB
Value Count Frequency (%)
0 9968 99.7%
 
(Missing) 32 0.3%
 

mths_since_last_major_derog
Categorical

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
2 3424 34.2%
 
3 3299 33.0%
 
1 3277 32.8%
 

Length

Max length1
Mean length1
Min length1
Value Count Frequency (%)
Decimal_Number 3 100.0%
 
Value Count Frequency (%)
Common 3 100.0%
 
Value Count Frequency (%)
ASCII 3 100.0%
 

policy_code
Categorical

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Value Count Frequency (%)
PC3 2098 21.0%
 
PC5 2025 20.2%
 
PC1 1978 19.8%
 
PC2 1962 19.6%
 
PC4 1937 19.4%
 

Length

Max length3
Mean length3
Min length3
Value Count Frequency (%)
Decimal_Number 5 71.4%
 
Uppercase_Letter 2 28.6%
 
Value Count Frequency (%)
Common 5 71.4%
 
Latin 2 28.6%
 
Value Count Frequency (%)
ASCII 7 100.0%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
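
As a sketch of this definition, the code below first converts each variable to ranks and then applies the covariance-over-standard-deviations formula to the ranks. The helper names are illustrative, and giving tied values the average of their positions is an assumption made here (it is a common convention, but the report does not state how ties are handled).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(average_ranks([10, 20, 20, 30]))
print(spearman_rho(x, y))
```

Because the cubic relationship is strictly increasing, the ranks of x and y coincide and ρ reaches its maximum even though the relationship is not linear.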

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
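
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; `kendall_tau_a` is a name chosen here for illustration, and the quadratic loop over all pairs is only practical for small inputs.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

Statistical libraries often default to a tie-corrected variant instead (for example, SciPy's `kendalltau` computes tau-b by default), so values for data with many ties may differ from this simple ratio.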

Missing values

Sample

First rows

Id is_bad emp_title emp_length home_ownership annual_inc verification_status pymnt_plan Notes purpose_cat purpose zip_code addr_state debt_to_income delinq_2yrs earliest_cr_line inq_last_6mths mths_since_last_delinq mths_since_last_record open_acc pub_rec revol_bal revol_util total_acc initial_list_status collections_12_mths_ex_med mths_since_last_major_derog policy_code
010Time Warner Cable10MORTGAGE50000.0not verifiednNaNmedicalMedical766xxTX10.870.012/1/920.0NaNNaN15.00.01208712.144.0f0.01PC4
120Ottawa University1RENT39216.0not verifiednBorrower added on 04/14/11 > I will be using this loan to pay off expenses accrued in the last six months on my credit cards, due to a combination of job transition, relocation for the job, and medical expenses from a broken tibula. I generally overpay my monthly minimum on my debts, so I expect that this loan will be repaid sooner than 5 years. I have a steady job working in the information technology field, I've been employed full-time in this field for over eight years, and have been with my present employer for seven months in good standing. My monthly budget breakdown is 1/3 of my paycheck going to rent and bills, 1/3 going to living and job transit expenses, and 1/3 remaining for general spending and payments.<br/>debt consolidationMy Debt Consolidation Loan660xxKS9.150.011/1/052.0NaNNaN4.00.01011464.05.0f0.02PC1
230Kennedy Wilson4RENT65000.0not verifiednNaNcredit cardAP Personal Loan916xxCA11.240.06/1/700.0NaNNaN4.00.0810.68.0f0.03PC4
340TOWN OF PLATTEKILL10MORTGAGE57500.0not verifiednNaNdebt consolidationDebt Consolidation Loan124xxNY6.181.09/1/820.016.0NaN6.00.01003037.123.0f0.02PC2
450Belmont Correctional10MORTGAGE50004.0VERIFIED - incomenI want to consolidate my debt, pay for a vacation and buy a ring.debt consolidationconsolidate439xxOH19.030.010/1/994.0NaNNaN8.00.01074040.421.0f0.03PC3
560BAE Systems4RENT47028.0VERIFIED - incomenNaNother16-Oct-10200xxDC7.832.012/1/991.019.0NaN6.00.0171526.425.0f0.03PC3
670Peninsula Counseling Center10MORTGAGE126000.0not verifiednBorrower added on 05/18/10 > mick credit card consolidation loan - 100% payoff of credit card debt - amex, sears, macys and bank of america<br/>credit cardmick credit card loan103xxNY14.280.011/1/790.0NaNNaN18.00.0546611.129.0f0.03PC1
780Health Plan of Nevada6MORTGAGE42000.0VERIFIED - income sourcenBorrower added on 11/29/11 > Loan is for debt consolidation and will be paid timely. Employed in the healthcare industry for 6 years since moving to NV 7 years ago and have always had stable job positions. Thank you very much for your assistance.<br>debt consolidationCC loan891xxNV10.290.04/1/060.0NaNNaN9.00.01035495.910.0f0.03PC3
890John Deere2MORTGAGE50000.0VERIFIED - incomenNaNdebt consolidationConsolidation612xxIL15.360.02/1/012.0NaNNaN11.00.01966259.227.0f0.01PC5
9100NaN1RENT40000.0not verifiednThis loan would be for a 2006 PT Cruiser with only 300 miles on it. There is still a full warranty till Dec. 2009 in effect.carFICO score 762 want's to buy a new car926xxCA6.480.05/1/951.0NaNNaN11.00.01999818.323.0f0.01PC5

Last rows

Id is_bad emp_title emp_length home_ownership annual_inc verification_status pymnt_plan Notes purpose_cat purpose zip_code addr_state debt_to_income delinq_2yrs earliest_cr_line inq_last_6mths mths_since_last_delinq mths_since_last_record open_acc pub_rec revol_bal revol_util total_acc initial_list_status collections_12_mths_ex_med mths_since_last_major_derog policy_code
999099910Konica Minolta10MORTGAGE120000.0VERIFIED - incomenI am looking ofr a loan so that I can replace my septic system.home improvementHome Improvment481xxMI14.441.02/1/940.04.0NaN14.00.01471659.831.0f0.02PC2
999199920Ametek Aerospace and Defense10RENT63000.0VERIFIED - income sourcenBorrower added on 07/09/10 > loan app completed<br/>email verified<br/>bank account verified<br/>medicalLasik018xxMA10.080.05/1/890.0NaNNaN6.00.0601.122.0f0.03PC1
999299930the reis group10RENT52000.0VERIFIED - incomenBorrower added on 12/14/11 > looking to be debt free in 3 yrs or less!!<br>debt consolidationconsolidation124xxNY23.700.08/1/980.070.0NaN8.00.01500291.518.0f0.02PC5
999399940Astoria Fuel Corp.10OWN95892.0VERIFIED - incomenI live in a family owned home. It is my parents, but I am allowed to live here as long as I want as long as I pay for the taxes and any home improvements the home needs. I am looking for a loan to add a bathroom on the second floor and finish other small home improvements the house currently needs. I have lived here for 5 years and have done many updates already. This is the first major renovation I am doing on the house. I need the loan to get what I cannot do myself done the right way. I have excellent credit and pride myself on that.home improvementUpdates Needed on Family owned home110xxNY8.700.07/1/952.0NaNNaN3.00.0213930.67.0f0.03PC5
999499951Guitar Center1RENT24996.0VERIFIED - income sourcenNaNdebt consolidationPersonal Loan913xxCA3.790.08/1/050.0NaNNaN2.00.0480156.57.0f0.01PC1
999599960Cabot5MORTGAGE66250.0VERIFIED - incomenNaNweddingScottish Wedding014xxMA9.400.09/1/011.0NaNNaN8.00.0365624.110.0f0.02PC3
999699970Gallant & Wein1RENT26000.0VERIFIED - income sourcenBorrower added on 08/30/11 > credit cards consolidation and doctors bills..<br/>debt consolidationdebt112xxNY20.490.05/1/001.079.0NaN8.00.0670958.912.0f0.02PC3
999799980Weichert, Realtors8RENT47831.0not verifiednBorrower added on 03/10/10 > My dream is to finally end the cycle of revolving debt so that I can finally build a stable future for myself. I've been able to put some hardships behind and can see the light ahead if my loan is fully funded. I live modestly but cannot corral rising life expenses unless I can put away credit debt and service this loan. I will be a worthy LendingClub loan recipient. Thank you for your consideration!<br/>debt consolidationHarnessing credit debt for a stable future.070xxNJ24.130.012/1/890.0NaN111.09.01.01134660.717.0f0.03PC3
999899990meadwestvaco6MORTGAGE70000.0not verifiednNaNmajor purchasepersonal244xxVA16.182.03/1/992.016.0NaN9.00.01715750.927.0f0.02PC3
9999100000Rehab Alliance1RENT70560.0not verifiednBorrower added on 11/09/11 > order to pay back lenders quicker. Also, never been late on a payment. Job: Very stable, full-time job (40 hours/wk). Thank you!<br>credit cardCredit Card Loan900xxCA16.130.09/1/001.053.0NaN15.00.0230422.634.0f0.02PC5